acid sequence shown in SEQ ID NO:4.

A preferred isolated DNA according to the invention encodes a steroid receptor protein having the amino acid
sequence shown in SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID NO:25.

A more preferred isolated DNA according to the invention is an isolated DNA comprising a nucleotide sequence
shown in SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:20 or SEQ ID NO:24.

The DNA according to the invention may be obtained from cDNA. Alternatively, the coding sequence might be 
genomic DNA or prepared using DNA synthesis techniques. 

The DNA according to the invention will be very useful for in vivo expression of the novel receptor proteins according
to the invention in sufficient quantities and in substantially pure form.

In another aspect of the invention there is provided for a steroid receptor comprising the amino acid sequence
encoded by the above described DNA molecules. 

The steroid receptor according to the invention has an N-terminal domain, a DNA-binding domain and a
ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain of said receptor exhibits at least 80%
homology with the amino acid sequence shown in SEQ ID NO:3, and the amino acid sequence of said ligand-binding
domain of said receptor exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO:4.

In particular, the steroid receptor according to the invention has an N-terminal domain, a DNA-binding domain and
a ligand-binding domain, wherein the amino acid sequence of said DNA-binding domain of said receptor exhibits at
least 90%, preferably 95%, more preferably 98%, most preferably 100% homology with the amino acid sequence
shown in SEQ ID NO:3.

More particularly, the steroid receptor according to the invention has an N-terminal domain, a DNA-binding domain
and a ligand-binding domain, wherein the amino acid sequence of said ligand-binding domain of said receptor exhibits
preferably at least 80%, more preferably 90%, most preferably 100% homology with the amino acid sequence
shown in SEQ ID NO:4.

It will be clear to those skilled in the art that steroid receptor proteins comprising combined DBD and LBD
preferences, and DNA encoding such receptors, are also subject of the invention.

Preferably the steroid receptor according to the invention comprises an amino acid sequence shown in SEQ ID
NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID NO:25.

Also within the scope of the present invention are steroid receptor proteins which comprise variations in the amino 
acid sequence of the DBD and LBD without losing their respective DNA-binding or ligand-binding activities. The
variations that can occur in those amino acid sequences comprise deletions, substitutions, insertions, inversions or 
additions of (an) amino acid(s) in said sequence, said variations resulting in amino acid difference(s) in the overall 
sequence. It is well known in the art of proteins and peptides that these amino acid differences lead to amino acid 
sequences that are different from, but still homologous with the native amino acid sequence they have been derived 
from. 

Amino acid substitutions that are expected not to essentially alter biological and immunological activities have
been described in, for example, Dayhoff, M.O., Atlas of Protein Sequence and Structure, Nat. Biomed. Res. Found.,
Washington D.C., 1978, vol. 5, suppl. 3. Amino acid replacements between related amino acids or replacements which
have occurred frequently in evolution are, inter alia, Ser/Ala, Ser/Gly, Asp/Gly, Arg/Lys, Asp/Asn, Ile/Val. Based on this
information Lipman and Pearson developed a method for rapid and sensitive protein comparison (Science 227,
1435-1441, 1985) and determining the functional similarity between homologous polypeptides.

Variations in amino acid sequence of the DBD according to the invention resulting in an amino acid sequence that
has at least 80% homology with the sequence of SEQ ID NO:3 will lead to receptors still having sufficient DNA binding
activity. Variations in amino acid sequence of the LBD according to the invention resulting in an amino acid sequence
that has at least 70% homology with the sequence of SEQ ID NO:4 will lead to receptors still having sufficient ligand
binding activity.

Homology as defined herein is expressed in percentages, determined via PCGENE. Homology is calculated as 
the percentage of identical residues in an alignment with the sequence according to the invention. Gaps are allowed 
to obtain maximum alignment. 
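
The homology definition above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned (with "-" as the gap character) and that the percentage is taken over the residues of the reference sequence, that is, the sequence according to the invention. Both the gap character and the choice of denominator are assumptions made for illustration; the exact procedure used by PCGENE is not described in the text, and the function names are hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_query: str, gap: str = "-") -> float:
    """Percentage of reference residues that are identical in the alignment.

    Assumption: the denominator is the number of (non-gap) residues of the
    reference sequence. PCGENE may use a different convention.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_len = sum(1 for r in aligned_ref if r != gap)
    if ref_len == 0:
        raise ValueError("reference sequence contains no residues")
    identical = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r == q and r != gap
    )
    return 100.0 * identical / ref_len


def meets_threshold(aligned_ref: str, aligned_query: str, threshold: float) -> bool:
    """True if the aligned query reaches the given homology threshold (in %)."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Toy alignment (hypothetical residues, not taken from SEQ ID NO:3 or NO:4):
# six of the eight reference residues are matched by identical residues.
example = percent_identity("CEGCKAFF", "CEG-KSFF")
```

With such a helper, the 80% (DBD) and 70% (LBD) criteria reduce to calls like `meets_threshold(ref, query, 80.0)`.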

Comparing the amino acid sequences of the classical ER and the ER's according to the invention revealed a high 
degree of similarity within their respective DBD's. The conservation of the P-box (amino acids E-G-X-X-A) which is 
responsible for the actual interactions of the classical ER with the target DNA element (Zilliacus et al., Mol. Endo. 9,
389, 1995; Glass, End. Rev. 15, 391, 1994), is indicative for a recognition of estrogen responsive elements (ERE's) by
the ER's according to the invention. The receptors according to the invention indeed showed ligand-dependent
transactivation on ERE-containing reporter constructs. Therefore, the classical ER and the novel ER's according to the
invention may have overlapping target gene specificities. This could indicate that in tissues which co-express both 
respective ER's, these receptors compete for ERE's. The ER's according to the invention may regulate transcription 
of target genes differently from classical ER regulation or could simply block classical ER functioning by occupying 
estrogen responsive elements. Alternatively, transcription might be influenced by heterodimerization of the different 
receptors.

Thus, a preferred steroid receptor according to the invention comprises the amino acid sequence E-G-X-X-A within
the P box of the DNA binding domain, wherein X stands for any amino acid. Also within the scope of the invention is
isolated DNA encoding such a receptor.

Methods to prepare the receptors according to the invention are well known in the art (Sambrook et al., Molecular
Cloning: a Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 1989). The most practical
approach is to produce these receptors by expression of the DNA encoding the desired protein.

A wide variety of host cell and cloning vehicle combinations may be usefully employed in cloning the nucleic acid
sequence coding for the receptor of the invention. For example, useful cloning vehicles may include chromosomal,
non-chromosomal and synthetic DNA sequences such as various known bacterial plasmids and wider host range
plasmids and vectors derived from combinations of plasmids and phage or virus DNA. Useful hosts may include
bacterial hosts, yeasts and other fungi, plant or animal hosts, such as Chinese Hamster Ovary (CHO) cells or monkey
cells and other hosts.

Vehicles for use in expression of the ligand-binding domain of the present invention will further comprise control
sequences operably linked to the nucleic acid sequence coding for the ligand-binding domain. Such control sequences
generally comprise a promoter sequence and sequences which regulate and/or enhance expression levels.
Furthermore, an origin of replication and/or a dominant selection marker are often present in such vehicles. Of course control
and other sequences can vary depending on the host cell selected.

Techniques for transforming or transfecting host cells are well known in the art (see, for instance, Sambrook et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, 1989).

Recombinant expression vectors comprising the DNA of the invention as well as cells transformed with said DNA 
or said expression vector also form part of the present invention. 

In a further aspect of the invention, there is provided for a chimeric receptor protein having an N-terminal domain,
a DNA-binding domain, and a ligand-binding domain, characterized in that at least one of the domains originates from
a receptor protein according to the invention, and at least one of the other domains of said chimeric protein originates
from another receptor protein from the nuclear receptor superfamily, provided that the DNA-binding domain and the
ligand-binding domain of said chimeric receptor protein originate from different proteins.

In particular, the chimeric receptor according to the invention comprises the LBD according to the invention, said 
LBD having an amino acid sequence which exhibits at least 70% homology with the amino acid sequence shown in
SEQ ID NO:4. In that case the N-terminal domain and DBD should be derived from another nuclear receptor, such as
for example PR. In this way a chimeric receptor is constructed which is activated by a ligand of the ER according to 
the invention and which targets a gene under control of a progesterone responsive element. The chimeric receptors 
having a LBD according to the invention are useful for the screening of compounds to identify novel ligands or hormone 
analogs which are able to activate an ER according to the invention. 

In addition, chimeric receptors comprising a DBD according to the invention, said DBD having an amino acid
sequence exhibiting at least 80% homology with the amino acid sequence shown in SEQ ID NO:3, and a LBD and,
optionally, an N-terminal domain derived from another nuclear receptor, can be successfully used to identify novel ligands
or hormone analogs for said nuclear receptors. Such chimeric receptors are especially useful for the identification of
the respective ligands of orphan receptors.

Since steroid receptors have three domains with different functions, which are more or less independent, it is
possible that all three functional domains have been derived from different members of the steroid receptor superfamily.

Molecules which contain parts having a different origin are called chimeric. Such a chimeric receptor comprising
the ligand-binding domain and/or the DNA-binding domain of the invention may be produced by chemical linkage, but
most preferably the coupling is accomplished at the DNA level with standard molecular biological methods by fusing
the nucleic acid sequences encoding the necessary steroid receptor domains. Hence, DNA encoding the chimeric
receptor proteins according to the invention is also a subject of the present invention.

Such chimeric proteins can be prepared by transfecting DNA encoding these chimeric receptor proteins to suitable
host cells and culturing these cells under suitable conditions.

It is extremely practical if, next to the information for the expression of the steroid receptor, the host cell is also
transformed or transfected with a vector which carries the information for a reporter molecule. Such a vector coding
for a reporter molecule is characterized by having a promoter sequence containing one or more hormone responsive
elements (HRE) functionally linked to an operative reporter gene. Such an HRE is the DNA target of the activated steroid
receptor and, as a consequence, it enhances the transcription of the DNA coding for the reporter molecule. In in vivo
settings of steroid receptors the reporter molecule comprises the cellular response to the stimulation of the ligand.

However, it is possible in vitro to combine the ligand-binding domain of a receptor with the DNA-binding domain and
transcription-activating domain of other steroid receptors, thereby enabling the use of other HRE and reporter molecule
systems. One such system is established by an HRE present in the MMTV-LTR (mouse mammary tumor virus long
terminal repeat sequence) in connection with a reporter molecule like the firefly luciferase gene or the bacterial gene
for CAT (chloramphenicol acetyltransferase). Other HRE's which can be used are the rat oxytocin promoter, the retinoic acid
responsive element, the thyroid hormone responsive element and the estrogen responsive element; synthetic
responsive elements have also been described (for instance in Fuller, ibid, page 3096). As reporter molecules, next to CAT
and luciferase, β-galactosidase can be used.

Steroid hormone receptors and chimeric receptors according to the present invention can be used for the in vitro
identification of novel ligands or hormonal analogs. For this purpose binding studies can be performed with cells
transformed with DNA according to the invention or an expression vector comprising DNA according to the invention, said
cells expressing the steroid receptors or chimeric receptors according to the invention.

The novel steroid hormone receptor and chimeric receptors according to the invention, as well as the ligand-binding
domain of the invention, can be used in an assay for the identification of functional ligands or hormone analogs for the
nuclear receptors.

Thus, the present invention provides for a method for identifying functional ligands for the steroid receptors and 
chimeric receptors according to the invention, said method comprising the steps of 

a) introducing into a suitable host cell 1) DNA or an expression vector according to the invention, and 2) a suitable
reporter gene functionally linked to an operative hormone response element, said HRE being able to be activated
by the DNA-binding domain of the receptor protein encoded by said DNA;

b) bringing the host cell from step a) into contact with potential ligands which will possibly bind to the ligand-binding
domain of the receptor protein encoded by said DNA from step a);

c) monitoring the expression of the receptor protein encoded by said reporter gene of step a).

If expression of the reporter gene is induced with respect to basic expression (without ligand), the functional ligand 
can be considered as an agonist; if expression of the reporter gene remains unchanged or is reduced with respect to 
basic expression, the functional ligand can be a suitable (partial) antagonist.
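
The decision rule in the preceding paragraph can be sketched as follows. The sketch assumes that reporter expression is available as a single positive number (for example a normalised luciferase reading) for the basal and the ligand-treated condition. The function name and the fold-induction cut-off of 1.5 are hypothetical illustration values; the text itself does not specify a numerical threshold for "induced".

```python
def classify_ligand(basal: float, with_ligand: float,
                    min_fold_induction: float = 1.5) -> str:
    """Classify a functional ligand from reporter expression levels.

    basal: reporter expression without ligand (must be positive).
    with_ligand: reporter expression in the presence of the ligand.
    min_fold_induction: assumed cut-off above which expression counts as
        induced (hypothetical value, not taken from the text).
    """
    if basal <= 0:
        raise ValueError("basal expression must be positive")
    fold = with_ligand / basal
    if fold >= min_fold_induction:
        return "agonist"
    # Unchanged or reduced expression: the ligand may be a (partial) antagonist.
    return "possible (partial) antagonist"
```

A ligand classified in the second group would still need to be tested in the presence of a known agonist before being called an antagonist.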

For performing such investigations, host cells which have been transformed or transfected with both a vector
encoding a functional steroid receptor and a vector having the information for a hormone responsive element and a
connected reporter molecule are cultured in a suitable medium. After addition of a suitable ligand, which will activate
the receptor, the production of the reporter molecule will be enhanced, which production can simply be determined by
assays having a sensitivity for the reporter molecule. See for instance WO-A-8803168. Assays with known steroid
receptors have been described (for instance S. Tsai et al., Cell 57, 443, 1989; M. Meyer et al., Cell 57, 433, 1989).

Legends to the figures 
Figure 1.

Northern analysis of the novel estrogen receptor (ERβ). Two different multiple tissue Northern blots (Clontech)
were hybridised with a specific probe for ERβ (see examples). Indicated are the human tissues the RNA originated
from and the position of the size markers in kilobases (kb).

Figure 2.

Histogram showing the 3- to 4-fold stimulatory effect of 17β-estradiol, estriol and estrone on the luciferase activity
mediated by ERβ. An expression vector encoding ERβ was transiently transfected into CHO cells together with a
reporter construct containing the rat oxytocin promoter in front of the firefly luciferase encoding sequence (see examples).

Figure 3. 

Effect of 17β-estradiol (E2) alone or in combination with the anti-estrogen ICI-164384 (ICI) on ERα and ERβ.
Expression constructs for ERα (the classical ER) and ERβ were transiently transfected into CHO cells together with
the rat oxytocin promoter-luciferase reporter construct described in the examples. Luciferase activities were determined
in triplicate and normalised for transfection efficiency by measuring β-galactosidase in the same lysate.

Figure 4. 

Expression of ERα and ERβ in a number of cell lines determined by RT-PCR analysis (see examples). The cell
lines used were derived from different tissues/cell types: endometrium (ECC1, Ishikawa, HEC-1A, RL95-2);
osteosarcoma (SAOS-2, U2-OS, HOS, MG63); breast tumours (MCF-7, T47D); endothelium (HUV-EC-C, BAEC-1); smooth
muscle (HISM, PAC-1, A7R5, A10, RASMC, CavaSMC); liver (HepG2); colon (CaCo2); and vagina (Hs-760T, SW-954).
All cell lines were human except for PAC-1, A7R5, A10 and RASMC which are of rat origin, BAEC-1 which is of
bovine origin and CavaSMC which is of guinea pig origin.

Figure 5.

Transactivation assay using stably transfected CHO cell lines expressing ERα or ERβ together with the rat
oxytocin-luciferase estrogen-responsive reporter (see examples for details). Hormone-dependent transactivation curves were
determined for 17β-estradiol and for Org4094. For the ER antagonist raloxifen, cells were treated with 2 x 10^[illegible] mol/L
17β-estradiol together with increasing concentrations of raloxifen. Maximal values of the responses were arbitrarily
set at 100%.

Examples 

A. Molecular cloning of the novel estrogen receptor. 

Two degenerate oligonucleotides containing inosines (I) were based on conserved regions of the DNA-binding 
domains and the ligand-binding domains of the human steroid hormone receptors. 

Primer #1:

5'-GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)GG-3' (SEQ ID NO:7).

Primer #2:

5'-AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT-3' (SEQ ID NO:8).
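
As an illustration of what the degenerate notation implies, the following sketch parses the two primers above and counts how many distinct oligonucleotides each mixture contains. It treats each parenthesised position such as (C/T) as a choice and counts inosine (I) as a single, non-degenerate position, which is an assumption made for this example. The helper names are hypothetical.

```python
import re
from math import prod

# One token is either a parenthesised choice such as (C/T) or a single base / inosine.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")

PRIMER_1 = "GGIGA(C/T)GA(A/G)GC(A/T)TCIGGITG(C/T)CA(C/T)TA(C/T)GG"
PRIMER_2 = "AAGCCTGG(C/G)A(C/T)IC(G/T)(C/T)TTIGCCCAI(C/T)TIAT"


def parse_primer(notation: str) -> list:
    """Return one tuple of allowed bases per primer position."""
    compact = notation.replace(" ", "")
    positions = []
    pos = 0
    while pos < len(compact):
        match = TOKEN.match(compact, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {compact[pos]!r}")
        if match.group(1):
            positions.append(tuple(match.group(1).split("/")))
        else:
            positions.append((match.group(2),))
        pos = match.end()
    return positions


def degeneracy(notation: str) -> int:
    """Number of distinct sequences in the mixture (inosine counted once)."""
    return prod(len(choices) for choices in parse_primer(notation))


primer_1_length = len(parse_primer(PRIMER_1))
primer_1_degeneracy = degeneracy(PRIMER_1)
primer_2_degeneracy = degeneracy(PRIMER_2)
```

Under these assumptions primer #1 has six two-fold positions and primer #2 has five.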

As template, cDNA from human EBV-stimulated PBLs (peripheral blood leukocytes) was used. One microgram of
total RNA was reverse transcribed in a 20 µl reaction containing 50 mM KCl, 10 mM Tris-HCl pH 8.3, 4 mM MgCl2, 1
mM dNTPs (Pharmacia), 100 pmol random hexanucleotides (Pharmacia), 30 Units RNAse inhibitor (Pharmacia) and
200 Units M-MLV Reverse transcriptase (Gibco BRL). Reaction mixtures were incubated at 37°C for 30 minutes and
heat-inactivated at 100°C for 5 minutes. The cDNA obtained was used in a 100 µl PCR reaction containing 10 mM
Tris-HCl pH 8.3, 50 mM KCl, 1.5 mM MgCl2, 0.001% gelatin (w/v), 3% DMSO, 1 microgram of primer #1 and primer
#2 and 2.5 Units of Amplitaq DNA polymerase (Perkin Elmer). PCR reactions were performed in the Perkin Elmer 9600
thermal cycler. The initial denaturation (4 minutes at 94°C) was followed by 35 cycles with the following conditions: 30
sec 94°C, 30 sec 45°C, 1 minute 72°C, and after 7 minutes at 72°C the reactions were stored at 4°C. Aliquots of these
reactions were analysed on a 1.5% agarose gel. Fragments of interest were cut out of the gel, reamplified using identical
PCR conditions and purified using Qiaex II (Qiagen). Fragments were cloned in the pCRII vector and transformed into
bacteria using the TA-cloning kit (Invitrogen). Plasmid DNA was isolated for nucleotide sequence analysis using the
Qiagen plasmid midi protocol (Qiagen). Nucleotide sequence analysis was performed with the ALF automatic
sequencer (Pharmacia) using a T7 DNA sequencing kit (Pharmacia) with vector-specific or fragment-specific primers.
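
The thermal cycling conditions described above can be written down as a small data structure, which makes it easy to check the programme and estimate its duration. This is only a sketch: the step labels (denaturation, annealing, extension) are conventional names assumed here, ramp times between temperatures are ignored, and the final 4°C storage step is left out because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # conventional name, assumed for illustration
    temp_c: int     # block temperature in degrees Celsius
    seconds: int    # hold time


INITIAL_DENATURATION = Step("initial denaturation", 94, 4 * 60)
CYCLE = (
    Step("denaturation", 94, 30),
    Step("annealing", 45, 30),
    Step("extension", 72, 60),
)
FINAL_EXTENSION = Step("final extension", 72, 7 * 60)
N_CYCLES = 35


def expanded_program(initial: Step, cycle: tuple, n_cycles: int, final: Step) -> list:
    """Flatten the programme into the sequence of steps the cycler executes."""
    return [initial] + list(cycle) * n_cycles + [final]


def total_hold_seconds(initial: Step, cycle: tuple, n_cycles: int, final: Step) -> int:
    """Sum of all hold times, excluding temperature ramps and 4 degree storage."""
    return sum(step.seconds for step in expanded_program(initial, cycle, n_cycles, final))


program = expanded_program(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
hold_seconds = total_hold_seconds(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
```

The later reactions in this example that use a 60°C annealing temperature would differ only in the annealing step of `CYCLE`.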

One cloned fragment corresponded to a novel estrogen receptor (ER) which is closely related to the classical
estrogen receptor. Part of the cloned novel estrogen receptor fragment (nucleotides 466 to 797 in SEQ ID NO:1) was
amplified by PCR using oligonucleotide #3 TGTTACGAAGTGGGAATGGTGA (SEQ ID NO:9) and oligonucleotide #2
and used as a probe to screen a human testis cDNA library in λgt11 (Clontech #HL1010b). Recombinant phages were
plated (using Y1090 bacteria grown in LB medium supplemented with 0.2% maltose) at a density of 40,000 pfu
(plaque-forming units) per 135 mm dish and replica filters (Hybond-N, Amersham) were made as described by the supplier.
Filters were prehybridised in a solution containing 0.5 M phosphate buffer (pH 7.5) and 7% SDS at 55°C for at least
30 minutes. DNA probes were purified with Qiaex II (Qiagen), 32P-labelled with a Decaprime kit (Ambion) and added
to the prehybridisation solution. Filters were hybridised at 65°C overnight and then washed in 0.5 x SSC/0.1% SDS
at 65°C. Two positive plaques were identified and could be shown to be identical. These clones were purified by
rescreening one more time. A PCR reaction on the phage eluates with the λgt11-specific primers #4:
5'-TTGAGACCAGACCAACTGGTAATG-3' (SEQ ID NO:10) and #5: 5'-GGTGGCGACGACTCGTGGAGGGCG-3' (SEQ ID NO:11)
yielded a fragment of 1700 basepairs on both clones.

Subsequent PCR reactions using combinations of a gene-specific primer #6: 5'-GTACACTGATTTGTAGCTGGAC-3'
(SEQ ID NO:12) with the λgt11 primer #4 and gene-specific primer #7: 5'-CCATGATGATGTCCCTGACC-3' (SEQ ID
NO:13) with λgt11 primer #5 yielded fragments of approximately 450 bp and 1000 bp, respectively, which were
cloned in the pCRII vector and used for nucleotide sequence analysis. The conditions for these PCR reactions were
as described above except for the primer concentrations (200 ng of each primer) and the annealing temperature (60°C).
Since in the cDNA clone the homology with the ER is lost abruptly at a site which corresponds to the exon 7/exon 8
boundary in the ER (between nucleotides 1247 and 1248 in SEQ ID NO:1), it was suggested that this sequence
corresponds to intron 7 of the novel ER gene. For verification of the nucleotide sequences of this cDNA clone, a 1200 bp
fragment was generated on the cDNA clone with λgt11 primer #4 and a gene-specific primer #8 corresponding to the
3' end of exon 7: 5'-TCGCATGCCTGACGTGGGAC-3' (SEQ ID NO:14) using the proofreading Pfu polymerase
(Stratagene). This fragment was also cloned in the pCRII vector and completely sequenced and was shown to be identical
to the sequences obtained earlier.

To obtain nucleotide sequences of the novel ER downstream of exon 7, a degenerate oligonucleotide based on
the AF-2 region of the classical ER (#9: 5'-GGC(C/G)TCCAGCATCTCCAG(C/G)A(A/G)GAG-3'; SEQ ID NO:15) was
used together with the gene-specific oligonucleotide #10: 5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO:16) using
testis cDNA as template (Marathon-ready testis cDNA, Clontech Cat. #7414-1). A specific 220 bp fragment
corresponding to nucleotides 1112 to 1332 in SEQ ID NO:1 was cloned and sequenced. Nucleotides 1112 to 1247 were identical
to the corresponding sequence of the cDNA clone. The sequence downstream thereof is highly homologous with the
corresponding region in the classical ER. In order to obtain sequences of the novel ER downstream of the AF-2 region,
RACE (rapid amplification of cDNA ends) PCR reactions were performed using the Marathon-ready testis cDNA
(Clontech) as template. The initial PCR was performed using oligonucleotide #11: 5'-TCTTGTTCTGGACAGGGATG-3' (SEQ
ID NO:17) in combination with the AP1 primer provided in the kit. A nested PCR was performed on an aliquot of this
reaction using oligonucleotide #10 (SEQ ID NO:16) in combination with the oligo dT primer provided in the kit.
Subsequently, an aliquot of this reaction was used in a nested PCR using oligonucleotide #12:
5'-GCATGGAACATCTGCTCAAC-3' (SEQ ID NO:18) in combination with the oligo dT primer. Nucleotide sequence analysis of a specific fragment
that was obtained (corresponding to nucleotides 1256 to 1431 in SEQ ID NO:1) revealed a sequence encoding the
carboxyterminus of the novel ER ligand-binding domain, including an F-domain and a translational stop codon and
part of the 3' untranslated sequence which is not included in SEQ ID NO:1. The deduced amino acid sequence is
shown in SEQ ID NO:5.

In order to investigate the possibility that the novel estrogen receptor had additional, upstream translation-initiation
codons, RACE-PCR experiments were performed using Marathon-ready testis cDNA (Clontech Cat. #7414-1). First
a PCR was performed using oligonucleotide SEQ ID NO:12 (antisense corresponding to nucleotides 416-395 in SEQ
ID NO:1) and AP-1 (provided in the kit). A nested PCR was then performed using an oligonucleotide having SEQ ID NO:27
(antisense corresponding to nucleotides 254-231 in SEQ ID NO:1) with AP-2 (provided in the kit). From the smear
that was obtained, the region corresponding to fragments larger than 300 basepairs was cut out, purified using the
GenecleanII kit (Bio101) and cloned using the TA-cloning kit (Clontech). Colonies were screened by PCR using
gene-specific primers: SEQ ID NO:22 and SEQ ID NO:28. The clone containing the largest insert was sequenced. The
nucleotide sequence corresponds to nucleotides 1 to 490 in SEQ ID NO:24. It is clear from this sequence that the first
in-frame upstream translation initiation codon is present at position 77-79 in SEQ ID NO:24. Upstream of this
translational start codon an in-frame stop codon is present (11-13 in SEQ ID NO:24). Consequently, the reading frame of the
novel estrogen receptor is 530 amino acids (shown in SEQ ID NO:25) and has a calculated molecular mass of 59.234
kD.
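
The reasoning used above (walk upstream in frame from a known codon, note every ATG, stop at the first in-frame stop codon) can be sketched in a few lines. The sequence in the example is a short hypothetical toy sequence, not SEQ ID NO:24, and the function name is an assumption for illustration. Positions are 1-based, as in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_start(seq: str, known_codon_start: int):
    """Find the most upstream in-frame ATG that is not preceded by a stop.

    seq: nucleotide sequence (sense strand).
    known_codon_start: 1-based position of the first base of any codon
        known to lie in the reading frame (for example a known ATG).
    Returns (atg_position, stop_position), both 1-based; either may be None.
    """
    seq = seq.upper()
    first_atg = None
    stop_pos = None
    i = known_codon_start - 1
    # Step upstream one codon at a time, staying in frame.
    while i >= 0:
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            stop_pos = i + 1
            break
        if codon == "ATG":
            first_atg = i + 1
        i -= 3
    return first_atg, stop_pos


# Hypothetical toy sequence: stop codon at 3-5, ATG codons at 9-11 and 15-17.
TOY = "CC" + "TGA" + "GCA" + "ATG" + "AAA" + "ATG" + "GGC"
toy_result = upstream_start(TOY, 15)
```

An upstream in-frame stop codon, as reported for positions 11-13 of SEQ ID NO:24, is what allows the first ATG found after it to be taken as the start of the reading frame.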

To confirm the nucleotide sequences obtained by 5' RACE, human genomic clones were obtained and analysed.
A human genomic library in λEMBL3 (Clontech HL1067J) was screened with a probe corresponding to nucleotides 1
to 416 in SEQ ID NO:1. A strongly hybridizing clone was plaque-purified and DNA was isolated using standard protocols
(Sambrook et al., 1989). The DNA was digested with several restriction enzymes, electrophoresed on agarose gel and
blotted onto Nylon filters. Hybridisation of the blot with a probe corresponding to the above-mentioned RACE fragment
(nucleotides 1-490 in SEQ ID NO:24) revealed a hybridizing Sau3A fragment of approximately 800 basepairs. This
fragment was cloned into the BamHI site of pGEM3Z and sequenced. The nucleotide sequence contained one base
difference which is probably a PCR-induced point mutation in the RACE fragment. Nucleotide 172 was a G residue in
the 5' RACE fragment, but an A residue in several independent genomic subclones.

B. Identification of two splice variants of the novel estrogen receptor.

Rescreening of the testis cDNA library with a probe corresponding to nucleotides 918 to 1246 in SEQ ID NO:1
yielded two hybridizing clones, the 3' ends of which were amplified by PCR (gene-specific primer #10:
5'-GGAAGCTGGCTCACTTGCTG-3' (SEQ ID NO:16) together with primer #4, SEQ ID NO:10), cloned and sequenced. One clone [...]
RT-PCR analysis of expression of ERα and ERβ.

RNA was isolated from a number of [...] (Cinna/Biotecx). cDNA was made using 5 microgram of total RNA using the
[...]. [...] of the cDNA was used for specific PCR amplification [...] labeled PCR fragments generated [...].

Cell culture

Chinese Hamster Ovary (CHO K1) cells were obtained from ATCC (CCL61) and maintained at 37°C in a humidified
atmosphere (5% CO2) [...] of a mixture (1:1) of Dulbecco's Modified Eagles Medium (DMEM, Gibco 074-200) and
[...] carbonate (Baker), 55 µg/ml sodium pyruvate [...] supplemented [...].



The ERβ-encoding sequence as presented in SEQ ID NO:1 was amplified by PCR using oligonucleotides 5'-[...]
mammalian cell expression vector pNGV1 (Genbank accession No. X99274).

An expression construct encoding the ERβ reading [...] SEQ ID NO:26 in combination with SEQ ID NO:[...]
[...] PCR using the above-mentioned 5' RACE fragment.

The reporter vector was based on [...] oxytocin gene regulatory region [...] (a HindIII/MboI
fragment; R. Ivell and D. Richter, Proc. Natl. Acad. Sci. USA 81, 2006-2010, 1984) [...]
sequence; the regulatory region of the oxytocin gene was shown to possess [...] response
elements in vitro for both the rat (R. Adan et al., Biochem. Biophys. Res. Comm. 175, 117-122, 1991) and [...]
(Richard and H. Zingg, J. Biol. Chem. 265, 6098-6103, 1990).



Transient transfection

[...] at 37°C cells were washed twice with phenol red-free M505 + [...]
overnight at 37°C. After 24 hours hormones were added to the medium (10[...]). Cell extracts were made 48 hours
posttransfection by the addition of 200 µl lysis buffer (0.[...] M [...]) and a 20 µl sample was
[...] for 10 sec at 562 nm.

Stable transfection of the novel estrogen receptor.

[...] tested for a response to [...] cells per well). After 24 hours different concentrations of hormone
[...] microliter) to each well and light emission was measured using the Topcount (Packard).



Results.

[...] (SEQ ID NO:1 and SEQ ID NO:24) in transient transfections in
[...] for both ERα and ERβ with 17β-estradiol, [...]
in order to normalize for differences in transfection efficiency. [...]

Transactivation [...]
[...] as compared to ER[...] (Figure 5). For Org4094 this is also the case; however, the
[...] pronounced. The curves for raloxifen show that the potency of this antagonist
[...] compared to its potency to block ER[...] transactivation.



SEQUENCE LISTING

(1) GENERAL INFORMATION:
   (i) APPLICANT:
      (A) NAME: Akzo Nobel N.V.
      (B) STREET: Velperweg 76
      (C) CITY: Arnhem
      (E) COUNTRY: The Netherlands
      (F) POSTAL CODE (ZIP): 6824 BM
      (G) TELEPHONE: 0412-666379
      (H) TELEFAX: 0412-650592
      (I) TELEX: 37503 akpha nl
   (ii) TITLE OF INVENTION: Novel estrogen receptor
   (iii) NUMBER OF SEQUENCES: 26
   (iv) COMPUTER READABLE FORM:
      (A) MEDIUM TYPE: Floppy disk
      (B) COMPUTER: IBM PC compatible
      (C) OPERATING SYSTEM: PC-DOS/MS-DOS
      (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:
   (i) SEQUENCE CHARACTERISTICS:
      (A) LENGTH: 1434 base pairs
      (B) TYPE: nucleic acid
      (C) STRANDEDNESS: double
      (D) TOPOLOGY: linear
   (ii) MOLECULE TYPE: cDNA
   (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
55 
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ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 
AGAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTIT CTCCTTTAGT GGTCCATCGC 
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 
GATAAAMCC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAACTGTTA CGAAGTGGGA 
ATGGTGAAGT GTGGCTCCCG GAGAGWMA TGTGGGTACC GCCTTGTGCG GAfiACA^GA 
AGTGCCGACG AGCAGCTGCA CTGTGCCGGC AAGGCCAAGA GAAGTGGCGG CCACGCGCCC 
CGAGTGCGGG AfiCTGCTGCT GGACGCCCTG AGCCCCGAGC AGCTACTGCT CACCCTCCTG 
GAGGCTGAGC CGCCCCMGr GCTGATCAGC CGCCCCAGTG CGCCCTTCAC CGAGGCCTCC 
ATGATGATGT CCCTGACCAA GTTGGCCGAC AAGGAGTTGG TACACATGAT CAGCTGGGCC 
AAGAAGATTC CCGGCTTTGT GGAGCTCAGC CTGTTCGACC AAGTGCGGCT CTTGGAGAGC 
TGTTGGATGG AGGTGTTAAT GATGGGGCTG ATGTGGCGCT CAATTGACCA CCCCGGCAAG 
CTCATCTTTG CTCCAGATCT TGTTCTGGAC AGGGATGAGG GGAAATGCGT AGAAGGAATT 
CTGGAAATCT TTGACATGCT CCTGGCAACT ACTTCAAGGT TTCGAGAGTT AAAACTCCAA 
CACAAAGAAT ATCTCTGTGT CAAGGCCATG ATCCTGCTCA ATTCCACTTAT GTACCCTCTG 
GTCACAGCGA CCCAGGATGC TGACAGCAGC CGGAAGCTGG CTCACTTGCT GAACGCCGTG 



60 

120 

180 

240 

300 

360 

420 

480 

540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
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1434 



ACCGATGCTT TGCTTTGeCT GATTGCCAAC AGCGGCATCT CCTCCCAGCA GCAATCCATG 1200 

CGCCTGGCTA ACCTCCTGAT GCTCCTCTCC CACGTCAGGC MGCGAGTAA CAAGGGCATG 1260 

GAACATCTGC TCAACATGAA GTGCAAAAAT GTGGTCCCAG TGTATGACCT GCTGCTGGAG 1320 

ATGCTGAATG CCCACGTGCT TCGCGGGTGC AAGTCCTCCA TCACGGGGTC CGAGTGCAGC 1380 
CCGGCAGAGG ACAGTAAAAG CAAAGAGGGC TCCCAGAACC CACAGTCTCA GTGA 
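The stated nucleotide lengths can be cross-checked against the protein listings: an open reading frame of N base pairs that ends in a single stop codon encodes N/3 - 1 residues. The following is a minimal Python sketch of that arithmetic only. It assumes, as the ordering of the sequences in the description suggests but this listing does not state outright, that SEQ ID NO: 1 (1434 base pairs) and SEQ ID NO: 2 (1251 base pairs) are the open reading frames for SEQ ID NO: 5 (477 amino acids) and SEQ ID NO: 6 (416 amino acids). The function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def residues_encoded(n_base_pairs: int) -> int:
    """Residues encoded by an ORF of this length that ends in one stop codon."""
    if n_base_pairs % 3 != 0:
        raise ValueError("an open reading frame must be a whole number of codons")
    # The final codon is the stop signal, so it contributes no residue.
    return n_base_pairs // 3 - 1


# Last bases of SEQ ID NO: 1 as printed in the listing.
tail_of_seq_1 = "CACAGTCTCAGTGA"
last_codon = tail_of_seq_1[-3:]

print(residues_encoded(1434))     # length stated for SEQ ID NO: 1
print(residues_encoded(1251))     # length stated for SEQ ID NO: 2
print(last_codon in STOP_CODONS)  # does the listing end on a stop codon?
```

Under that assumption the two nucleotide lengths agree with the 477 and 416 amino acids given for SEQ ID NO: 5 and SEQ ID NO: 6.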
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATGAATTACA GCATTCCCAG CAATGTCACT AACTTGGAAG GTGGGCCTGG TCGGCAGACC 60
ACAAGCCCAA ATGTGTTGTG GCCAACACCT GGGCACCTTT CTCCTTTAGT GGTCCATCGC 120
CAGTTATCAC ATCTGTATGC GGAACCTCAA AAGAGTCCCT GGTGTGAAGC AAGATCGCTA 180
GAACACACCT TACCTGTAAA CAGAGAGACA CTGAAAAGGA AGGTTAGTGG GAACCGTTGC 240
GCCAGCCCTG TTACTGGTCC AGGTTCAAAG AGGGATGCTC ACTTCTGCGC TGTCTGCAGC 300
GATTACGCAT CGGGATATCA CTATGGAGTC TGGTCGTGTG AAGGATGTAA GGCCTTTTTT 360
AAAAGAAGCA TTCAAGGACA TAATGATTAT ATTTGTCCAG CTACAAATCA GTGTACAATC 420
GATAAAAACC GGCGCAAGAG CTGCCAGGCC TGCCGACTTC GGAAGTGTTA CGAAGTGGGA 480
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~o — — - 

™. «^=- -^^^ 

..^K.^^ TACACATGW CAGCTGGGCC ''BO 
ATC.TO.TGT CCCTOKCOVA ^GGCCG^C .AGGA«TT« TACAO. 

„^ ^CAGC CTGTTCGACC AAGTGCGGCT CTTGGAfl«5C 840 
;,,,akM.TTC CCGGCTTTCT GGACCTCAGC CI 

»,.rTfiGCSCT CAATTGACCA CCCCGGCAAG 900 
TGTTGOATGG AGOTGTTAAT C5ATCGGGCTG ATGTGGCGCT CAATT 

AGGGATGAGG GGAAATGCGT AGAAfiGAAIT 96° 
CTCATCmG CTCCAfiATCr TGTTCTGGAC AGGGATGAGG 

XCTTCAACCT TTCGAfiACTT AAAACTCCAA 1020 
CT^^CT TTGACAIGCT CCTGCCAACT ACPICAAGCT 

™.A.™CAAGGCCA.GA.CCTGCTCA^^^^ ^ 

.CACAGCGACCCAGG^GC^CGGAAGCrGGC^^ - 

,«CTTTGG<^ C^^TGCCAAG AGCGGCATCT CCTCC^ 

ACCGATGCTT TGGTTTGGGT GArT.^»vA-r«^ 

1251 

CCCCX^. ..CCTCCT^T OCCCT^ ' 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 66 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp
1 5 10 15

Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His
20 25 30

Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn
35 40 45

Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val
50 55 60

Gly Met
65

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 233 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Leu Val Leu Thr Leu Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser
1 5 10 15

Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr
20 25 30

Lys Leu Ala Asp Lys Glu Leu Val His Met Ile Ser Trp Ala Lys Lys
35 40 45

Ile Pro Gly Phe Val Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu
50 55 60

Glu Ser Cys Trp Met Glu Val Leu Met Met Gly Leu Met Trp Arg Ser
65 70 75 80

Ile Asp His Pro Gly Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp
85 90 95

Arg Asp Glu Gly Lys Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met
100 105 110

Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys
115 120 125

Glu Tyr Leu Cys Val Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr
130 135 140

Pro Leu Val Thr Ala Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala
145 150 155 160

His Leu Leu Asn Ala Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys
165 170 175

Ser Gly Ile Ser Ser Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu
180 185 190

Met Leu Leu Ser His Val Arg His Ala Ser Asn Lys Gly Met Glu His
195 200 205

Leu Leu Asn Met Lys Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu
210 215 220

Leu Glu Met Leu Asn Ala His Val Leu
225 230

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 477 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn Leu Glu Gly Gly Pro
1 5 10 15

Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp Pro Thr Pro Gly His
20 25 30

Leu Ser Pro Leu Val Val His Arg Gln Leu Ser His Leu Tyr Ala Glu
35 40 45

Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser Leu Glu His Thr Leu
50 55 60

Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val Ser Gly Asn Arg Cys
65 70 75 80

Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg Asp Ala His Phe Cys
85 90 95

Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp Ser
100 105 110

Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His Asn
115 120 125

Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn Arg
130 135 140

Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val Gly
145 150 155 160

Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys Gly Tyr Arg Leu Val
165 170 175

Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His Cys Ala Gly Lys Ala
180 185 190

Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg Glu Leu Leu Leu Asp
195 200 205

Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu Leu Glu Ala Glu Pro
210 215 220

Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro Phe Thr Glu Ala Ser
225 230 235 240

Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys Glu Leu Val His Met
245 250 255

Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val Glu Leu Ser Leu Phe
260 265 270

Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met Glu Val Leu Met Met
275 280 285

Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly Lys Leu Ile Phe Ala
290 295 300

Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys Cys Val Glu Gly Ile
305 310 315 320

Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr Ser Arg Phe Arg Glu
325 330 335

Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val Lys Ala Met Ile Leu
340 345 350

Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala Thr Gln Asp Ala Asp
355 360 365

Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala Val Thr Asp Ala Leu
370 375 380

Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser Gln Gln Gln Ser Met
385 390 395 400

Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His Val Arg His Ala Ser
405 410 415

Asn Lys Gly Met Glu His Leu Leu Asn Met Lys Cys Lys Asn Val Val
420 425 430

Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn Ala His Val Leu Arg
435 440 445

Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys Ser Pro Ala Glu Asp
450 455 460

Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln Ser Gln
465 470 475
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 416 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
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10 



15 



20 



I^u Ser pro Leu Val Val Ar? 
35 

c-r Leu Glu His Thr l*u 

«. ™ r 

50 

T«« Val Ser Gly Aan Arg Cya 

70 



Pro 
65 



75 



. Glv ser Ly« Arg A-P Ala Hia Fhe Cy» 
Ma ser Pro Val Thr Gly Pro Gly 
85 

, « , Tvr Gly Val Trp Ser 
e i.,o Tvr Ala ser Gly Tyr His Tyr w.y 
Ala val Cy. Ser Asp Tyr ax 

100 

, arfl ser He Gin Gly His Asn 
^ Ala Phe Phe lys Arg ser xj- 

cys Glu Gly Cys Lys Ala pn 

115 

- Hii,^ Tl«i Asp I»y» ASB Arg 
. Pr^ Ala Thr Asn Gin Cys Thr He Asp 
A5P Tyr He Cys Pro Ala Tar 

130 



Arg Lys Set cys 



145 



, rva Tvr Glu Val Gly 
Gin Al. cy. AT, Leu Arg Ly. Cy» Tyr 

150 



»r« Glu Arg Cy3 Gly Tyr Arg Leu Val 
He. val Lys Cy. OlV -r Arg Arg alu Arg 



165 



Glu Gin Leu Hi. Cy. Ala Gly Ly. 



Arg Arg Gin Arg Ser Ala Asp — 
180 

. v,l Arq Glu Leu Leu Leu Asp 

196 



Ala Leu Ser 



Pro Glu Gin Leu 



val Leu Thr Leu Leu Glu Ala Glu Pro 
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210 

pro His 
225 



215 



220 

Phe Thr Glu Ala Ser 
240 



230 



. Lv» Glu Leu Val Hi« H*^ 
. Met Scr Leu Thr Ly. Uu M. A-P 

Met Met Met scr i« ^so 
245 



^ 2T0 



260 

Met Glu Val I*u Met 

GW Leu Met Ttp Mg 3er He 

^ 295 
290 

pro W Leu V.1 L.U MP Arg A-p 

310 

305 

, Tvr Thr sex M 

325 

«et Il« 

I^u Lya ""^^ 34S 
340 



^ .1. Tbi Gin MP Ala MP 
Leu Val Tht Ala « 



355 



ser ser Aig LY* 
370 



Ala His I^u teu A»n Ala 



3T5 



365 

val Thr ASP Ala Leu 

380 



val Trp val He Ala uy 
390 

305 

v*l Arq His Ala Arg 
X^u Leu ser His Val Arg 

410 

405 



I^u Ala A5nI*uLeuHet 
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10 



(A) LEMtrrH. 

,C, STW«I>«>««SS-- both 



20 



2S 



30 



35 



40 



45 



— 



29 



29 



£S 
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70 



15 



20 



25 



X»TOW^«0« FOR SEQ 

CHMACTBRISTICS: 

IC) STBAMDEWreSS. 
«DL=COLB type: C««^ 



40 



45 



22 



24 



50 



55 



24 



10 



15 
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24 



nucleic acid 
\a single 
jp, TOPOWGV: iine»^ 



20 



2S 



30 



35 



40 



45 



TVPB: nucleic acid 
TOPOUXJY: linear 
MOLECOI.E Tm- CDH^ 

^.20 base pai-J^* 
^°"^* l.c acid 



55 



(1) 



22 



20 



25 
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(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TCGCATGCCT GACGTGGGAC
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGCSTCCAGC ATCTCCAGSA RCAG
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
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w 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
(jjy^GGC TCACTTGCTG
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TCTTGTTCTG GACAGGGATG

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GCATGGAACA TCTGCTCAAC
(2) INFORMATION FOR SEQ ID NO: 19:
(i) SEQUENCE CHARACTERISTICS:
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20 



25 



30 



-ED ID HO"' 



so r^CC AGGTTCJAW. 



21 
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420 



20 



25 



^C^CCA. 

aK.«««rra .t<;t«^«=' 

^ CrCCAG^CT T^CTO^ ^^'^''^ 

,.^TGcr ccwGCftwri acttosa- 



480 
540 
600 
660 
720 
780 
840 
900 
960 



45 
50 



55 



(3, «^no acid 

^„ sTBWOED^ESS: single 
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lie pro Ser M" 



V.1 Thr wn 



Tyr ser 

^ ^ Tro pro TUr pro Gly "13 

Ciy Arq Gin thr Tbt 

, . Hi» W» -^Vr 

rin Leu ser nts ^ 
Val Val His Arq Gin 
Leu set 40 



pro Le« 



pro Gin 55 

«^ AT, cy. 



Val h3TL Ar9 



Thr U.U .y. 



80 



75 



pro — ^0 

xra WP '^^ ^""^ 

val cy» ^^'^ 105 

100 

ser lie Gin Gly ^' ^" 

cy, Giu Gly cy» 

w Thr Asn Gin ^1" 

Tie cy» pro M» ^hr j^^q 
MP Tyr Il« ^35 

^« Arq Glu Ar, Cy» Gly Ty 
^ . Plv Set Ar9 1'=" 

net val W3 cy» <=^y no 

165 

His Cy8 Ala Gly W ^» 

Oln^r, ser Ala - — 
Arg Gl^ 
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180 



185 



190 



Val Arg 



Glu Leu Leu Leu Aap 



Glu Ala Glu pro 



195 



210 

. . Ala pro Pbe Thr Glu Ala ST 
Ti*. Set Arg Pro Ser Ala J?ro 
pro Hi5 val Leu Ue Set Arg 

*^ 230 

ser X.U T.r ^-P 



Met Ket 



245 



lie 3*r ^* ^ 265 



260 



Glu 



^« itet Glu V.1 teu Itet Hat 
9«r Cy« trp ^ 

285 



275 



260 



290 



pro ASP 

305 



310 



Leu 



^, ^ setr Arg Phe Arg Gl« 
* . xia Tbr Thr 3«r /w-** 

325 



335 

Ala Met He 



riu Tvr I^eu Cya Val Lys 
T Leu Gin His Lys Glu Tyr 
Leu Lya Leu 

340 

, val Thr Ala Thr Gla A«P Ala ASP 
set Met Tyr Pro Leu Val 



360 



Leu Asn Set 
355 

Ala val Thr A^P 
1.1. His Leu Leu Asn Aia 

ser ser Ar, I^Y^ 
370 

„ ser Gin Gin CI" ^er Met 



31 
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400 

395 

390 

385 

c r Kls Val Arg His Ala Arg 
, ;aa A.n Leu Leu Het Leu Leu Ser His 
Arg Leu Ala 

405 



Ser Ala 



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
T^^CCCTOCT AC^C
(2) INFORMATION FOR SEQ ID NO: 23:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

c^.«;atcct cacctcagog ccaggcgtca aG
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(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1898 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
..COAATCTT TOA^ACATT AXA««CCT TT«TCCC«:r TCTTCCAAO. TCmTCTCA 60
CAAOAC^CC AXATA^AAAA CTCACC^CT A«:CXXAATT CTCCT«:crC 120
CTACAACT^C A,^CA«CCA TCTTACCCCT C^GCACO^C TCCAXATACA TACCTXCCTC 180
CTATC.A.AC AOCCACCAXC ^^C^ --ACATTC TA^CCTO CT«XOW«. 240
TTACAGCATT CCCAGCAATG TCACTAACTT GGAAGGTGGG CCTGGTCGGC AGACCACAAG 300
CCCAAATGTG TTGTGGCCAA CACCTGGGCA CCTTTCTCCT TTAGTGGTCC ATCGCCAGTT 360
ATCACATCTG TATGCGGAAC CTCAAAAGAG TCCCTGGTGT GAAGCAAGAT CGCTAGAACA 420
CACCTTACCT GTAAACAGAG AGACACTGAA AAGGAAGGTT AGTGGGAACC GTTGCGCCAG 480
CCCTGTTACT GGTCCAGGTT CAAAGAGGGA TGCTCACTTC TGCGCTGTCT GCAGCGATTA 540
CGCATCGGGA TATCACTATG GAGTCTGGTC GTGTGAAGGA TGTAAGGCCT TTTTTAAAAG 600
AAGCATTCAA GGACATAATG ATTATATTTG TCCAGCTACA AATCAGTGTA CAATCGATAA 660
AAACCGGCGC AAGAGCTGCC AGGCCTGCCG ACTTCGGAAG TGTTACGAAG TGGGAATGGT 720
GAAGTGTGGC TCCCGGAGAG AGAGATGTGG GTACCGCCTT GTGCGGAGAC AGAGAAGTGC 780
CGACGAGCAG CTGCACTGTG CCGGCAAGGC CAAGAGAAGT GGCGGCCACG CGCCCCGAGT 840
GCGGGAGCTG CTGCTGGACG CCCTGAGCCC CGAGCAGCTA GTGCTCACCC TCCTGGAGGC 900
TGAGCCGCCC CATGTGCTGA TCAGCCGCCC CAGTGCGCCC TTCACCGAGG CCTCCATGAT 960
GATGTCCCTG ACCAAGTTGG CCGACAAGGA GTTGGTACAC ATGATCAGCT GGGCCAAGAA 1020
GATTCCCGGC TTTGTGGAGC TCAGCCTGTT CGACCAAGTG CGGCTCTTGG AGAGCTGTTG 1080
GATGGAGGTG TTAATGATGG GGCTGATGTG GCGCTCAATT GACCACCCCG GCAAGCTCAT 1140
CTTTGCTCCA GATCTTGTTC TGGACAGGGA TGAGGGGAAA TGCGTAGAAG GAATTCTGGA 1200
AATCTTTGAC ATGCTCCTGG CAACTACTTC AAGGTTTCGA GAGTTAAAAC TCCAACACAA 1260
AGAATATCTC TGTGTCAAGG CCATGATCCT GCTCAATTCC AGTATGTACC CTCTGGTCAC 1320
AGCGACCCAG GATGCTGACA GCAGCCGGAA GCTGGCTCAC TTGCTGAACG CCGTGACCGA 1380
TGCTTTGGTT TGGGTGATTG CCAAGAGCGG CATCTCCTCC CAGCAGCAAT CCATGCGCCT 1440
GGCTAACCTC CTGATGCTCC TGTCCCACGT CAGGCATGCG AGTAACAAGG GCATGGAACA 1500
TCTGCTCAAC ATGAAGTGCA AAAATGTGGT CCCAGTGTAT GACCTGCTGC TGGAGATGCT 1560
GAATGCCCAC GTGCTTCGCG GGTGCAAGTC CTCCATCACG GGGTCCGAGT GCAGCCCGGC 1620
AGAGGACAGT AAAAGCAAAG AGGGCTCCCA GAACCCACAG TCTCAGTGAC GCCTGGCCCT 1680
GAGGTGAACT GGCCCACAGA GGTCACAAGC TGAAGCGTGA ACTCCAGTGT GTCAGGAGCC 1740
TGGGCTTCAT CTTTCTGCTG TGTGGTCCCT CATTTGGTGA TGGCAGGCTT GGTCATGTAC 1800
CATCCTTCCC TCCACCTTCC CAACTCTCAG GAGTCGGTGT GAGGAAGCCA TAGTTTCCCT 1860
TGTTAGCAGA GGGACATTTG AATCGAGCGT TTCCACAC
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 530 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Met Asp Ile Lys Asn Ser Pro Ser Ser Leu Asn Ser Pro Ser Ser Tyr
1 5 10 15

Asn Cys Ser Gln Ser Ile Leu Pro Leu Glu His Gly Ser Ile Tyr Ile
20 25 30

Pro Ser Ser Tyr Val Asp Ser His His Glu Tyr Pro Ala Met Thr Phe
35 40 45

Tyr Ser Pro Ala Val Met Asn Tyr Ser Ile Pro Ser Asn Val Thr Asn
50 55 60

Leu Glu Gly Gly Pro Gly Arg Gln Thr Thr Ser Pro Asn Val Leu Trp
65 70 75 80

Pro Thr Pro Gly His Leu Ser Pro Leu Val Val His Arg Gln Leu Ser
85 90 95

His Leu Tyr Ala Glu Pro Gln Lys Ser Pro Trp Cys Glu Ala Arg Ser
100 105 110

Leu Glu His Thr Leu Pro Val Asn Arg Glu Thr Leu Lys Arg Lys Val
115 120 125

Ser Gly Asn Arg Cys Ala Ser Pro Val Thr Gly Pro Gly Ser Lys Arg
130 135 140

Asp Ala His Phe Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His
145 150 155 160

Tyr Gly Val Trp Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser
165 170 175

Ile Gln Gly His Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr
180 185 190

Ile Asp Lys Asn Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys
195 200 205

Cys Tyr Glu Val Gly Met Val Lys Cys Gly Ser Arg Arg Glu Arg Cys
210 215 220

Gly Tyr Arg Leu Val Arg Arg Gln Arg Ser Ala Asp Glu Gln Leu His
225 230 235 240

Cys Ala Gly Lys Ala Lys Arg Ser Gly Gly His Ala Pro Arg Val Arg
245 250 255

Glu Leu Leu Leu Asp Ala Leu Ser Pro Glu Gln Leu Val Leu Thr Leu
260 265 270

Leu Glu Ala Glu Pro Pro His Val Leu Ile Ser Arg Pro Ser Ala Pro
275 280 285

Phe Thr Glu Ala Ser Met Met Met Ser Leu Thr Lys Leu Ala Asp Lys
290 295 300

Glu Leu Val His Met Ile Ser Trp Ala Lys Lys Ile Pro Gly Phe Val
305 310 315 320

Glu Leu Ser Leu Phe Asp Gln Val Arg Leu Leu Glu Ser Cys Trp Met
325 330 335

Glu Val Leu Met Met Gly Leu Met Trp Arg Ser Ile Asp His Pro Gly
340 345 350

Lys Leu Ile Phe Ala Pro Asp Leu Val Leu Asp Arg Asp Glu Gly Lys
355 360 365

Cys Val Glu Gly Ile Leu Glu Ile Phe Asp Met Leu Leu Ala Thr Thr
370 375 380

Ser Arg Phe Arg Glu Leu Lys Leu Gln His Lys Glu Tyr Leu Cys Val
385 390 395 400

Lys Ala Met Ile Leu Leu Asn Ser Ser Met Tyr Pro Leu Val Thr Ala
405 410 415

Thr Gln Asp Ala Asp Ser Ser Arg Lys Leu Ala His Leu Leu Asn Ala
420 425 430

Val Thr Asp Ala Leu Val Trp Val Ile Ala Lys Ser Gly Ile Ser Ser
435 440 445

Gln Gln Gln Ser Met Arg Leu Ala Asn Leu Leu Met Leu Leu Ser His
450 455 460

Val Arg His Ala Ser Asn Lys Gly Met Glu His Leu Leu Asn Met Lys
465 470 475 480

Cys Lys Asn Val Val Pro Val Tyr Asp Leu Leu Leu Glu Met Leu Asn
485 490 495

Ala His Val Leu Arg Gly Cys Lys Ser Ser Ile Thr Gly Ser Glu Cys
500 505 510

Ser Pro Ala Glu Asp Ser Lys Ser Lys Glu Gly Ser Gln Asn Pro Gln
515 520 525

Ser Gln
530

(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GTGCGGATCC TCTCAAGACA TGGATATAAA 30
(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
AGTAACAGGG CTGGCGCAAC GGTTC 25

(2) INFORMATION FOR SEQ ID NO: 28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
ACTGGCGATG GACCACTAAA GG 22
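The oligonucleotides of SEQ ID NO: 27 and SEQ ID NO: 28 read as reverse complements of two stretches of SEQ ID NO: 1, as would be expected of antisense primers. The Python sketch below illustrates that check. The helper names are hypothetical, and the template is only positions 101 to 260 of SEQ ID NO: 1, transcribed from the listing above; the listing itself does not describe how the oligonucleotides were used.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Positions 101-260 of SEQ ID NO: 1, copied from the sequence listing.
TEMPLATE_START = 101
TEMPLATE = (
    "CTCCTTTAGTGGTCCATCGC"
    "CAGTTATCACATCTGTATGCGGAACCTCAAAAGAGTCCCTGGTGTGAAGCAAGATCGCTA"
    "GAACACACCTTACCTGTAAACAGAGAGACACTGAAAAGGAAGGTTAGTGGGAACCGTTGC"
    "GCCAGCCCTGTTACTGGTCC"
)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string; spaces from the listing are ignored."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


def antisense_site(oligo: str) -> int:
    """1-based position in SEQ ID NO: 1 where the oligo's target begins, or -1."""
    index = TEMPLATE.find(reverse_complement(oligo))
    return -1 if index < 0 else TEMPLATE_START + index


seq_27 = "AGTAACAGGG CTGGCGCAAC GGTTC"
seq_28 = "ACTGGCGATG GACCACTAAA GG"

print(antisense_site(seq_27))
print(antisense_site(seq_28))
```

By this check SEQ ID NO: 27 is complementary to positions 231 to 255 of SEQ ID NO: 1, and SEQ ID NO: 28 to positions 103 to 124.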



Claims 



1. Isolated DNA encoding a protein having an N-terminal domain, a DNA-binding domain and a ligand-binding domain,
wherein the amino acid sequence of said DNA-binding domain of said protein exhibits at least 80% homology with
the amino acid sequence shown in SEQ ID NO:3, and the amino acid sequence of said ligand-binding domain of
said protein exhibits at least 70% homology with the amino acid sequence shown in SEQ ID NO:4.

2. Isolated DNA according to claim 1, characterized in that the amino acid sequence of said DNA-binding domain
of said protein exhibits at least 90%, preferably 95%, more preferably 98%, most preferably 100% homology with
the amino acid sequence shown in SEQ ID NO:3.

3. Isolated DNA according to claim 1 or 2, characterized in that the amino acid sequence of said ligand-binding
domain of said protein exhibits at least 75%, preferably 80%, more preferably 90%, most preferably 100% homology
with the amino acid sequence shown in SEQ ID NO:4.

4. Isolated DNA according to claims 1 to 3, said DNA encoding a protein comprising the amino acid sequence of
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:21 or SEQ ID NO:25.

5. Isolated DNA according to claims 1 to 4, characterized in that said DNA comprises the nucleic acid sequence of
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:20 or SEQ ID NO:24.

6. A recombinant expression vector comprising the DNA according to any of the claims 1 to 5.

7. A cell transfected with DNA according to claims 1 to 5 or an expression vector according to claim 6.

8. A cell according to claim 7 which is a stably transfected cell line which expresses the steroid receptor protein
according to any of the claims 9 to 11.

9. Protein encoded by DNA according to claims 1 to 5 or an expression vector according to claim 6.

10. Protein according to claim 9, said protein comprising the amino acid sequence of SEQ ID NO:5, SEQ ID NO:6,
SEQ ID NO:21 or SEQ ID NO:25.

11. Chimeric protein having an N-terminal domain, a DNA-binding domain, and a ligand-binding domain,
characterized in that at least one of said domains of said chimeric protein originates from a protein according
to claims 9 or 10 and at least one of the other domains of said chimeric protein originates from another receptor
protein from the nuclear receptor superfamily, provided that the DNA-binding domain and the ligand-binding domain
of said chimeric protein originate from different proteins.

12. DNA encoding a protein according to claim 11.

13. Use of a DNA according to claims 1 to 5 or 12, an expression vector according to claim 6, a cell according to claim
7 or 8 or a protein according to claims 9 to 11 in a screening assay for identification of new drugs.

14. A method for identifying functional ligands for the protein according to claims 9 to 11, said method comprising the
steps of:

a) introducing into a suitable host cell 1) DNA according to claims 1 to 5 or 12, and 2) a suitable reporter gene
functionally linked to an operative hormone response element (HRE), said HRE being able to be activated by the
DNA-binding domain of the protein encoded by said DNA;

b) bringing the host cell from step a) into contact with potential ligands which will possibly bind to the
ligand-binding domain of the protein encoded by said DNA from step a);

c) monitoring the expression of the protein encoded by said reporter gene of step a).
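The percentage thresholds in claims 1 to 3 can be illustrated with a short calculation. The claims do not say how homology is to be computed; the Python sketch below assumes the simplest reading, percent identity over an ungapped alignment of equal-length sequences, and both function names are hypothetical. The reference sequence is the DNA-binding domain of SEQ ID NO: 3, and the variant is an invented single substitution used only to exercise the code.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles ungapped, equal-length alignments")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def min_matches(length, threshold_percent):
    """Smallest number of identical positions that reaches the threshold."""
    return (threshold_percent * length + 99) // 100  # integer ceiling


# SEQ ID NO: 3 (DNA-binding domain), three-letter codes from the listing.
SEQ_3 = """
Cys Ala Val Cys Ser Asp Tyr Ala Ser Gly Tyr His Tyr Gly Val Trp
Ser Cys Glu Gly Cys Lys Ala Phe Phe Lys Arg Ser Ile Gln Gly His
Asn Asp Tyr Ile Cys Pro Ala Thr Asn Gln Cys Thr Ile Asp Lys Asn
Arg Arg Lys Ser Cys Gln Ala Cys Arg Leu Arg Lys Cys Tyr Glu Val
Gly Met
""".split()

variant = list(SEQ_3)
variant[0] = "Ser"  # hypothetical substitution, not taken from the patent

print(len(SEQ_3))
print(round(percent_identity(SEQ_3, variant), 2))
print(min_matches(len(SEQ_3), 80))
```

On this reading, the 80% threshold of claim 1 corresponds to at least 53 identical positions out of the 66 residues of SEQ ID NO: 3, and the 70% threshold to at least 164 out of the 233 residues of SEQ ID NO: 4.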
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Fig. 2 



EP 0 798 378 A2

Transient transfection of CHO cells with estrogen receptors alpha and beta; incubation with estradiol (E2) and ICI.
Conditions: no hormone; E2 10^-9; E2 10^-9 + ICI 10^-6

ERα and ERβ RT-PCR on tissue-representative cell lines

Panel: ERα

1. Ishikawa
2. HEC-1A
3. RL95-2
4. ECC-1
5. SaOS-2
6. HOS
7. U2-OS
8. MG-63
9. MCF-7
10. T47-D
11. HS-760T
12. SW-954
13. Hep-G2
14. CaCo
15. HISVI
16. HUV-EC-C
17. BAEC-1
18. A10
19. A7R5
20. CavaSMC
21. RASMC
bl. blank



Figure ^ 
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